Pre and Post-hoc Diagnosis and Interpretation of Malignancy from Breast DCE-MRI
We propose a new method for breast cancer screening from DCE-MRI based on a
post-hoc approach that is trained using weakly annotated data (i.e., labels are
available only at the image level without any lesion delineation). Our proposed
post-hoc method automatically diagnoses the whole volume and, for positive
cases, it localizes the malignant lesions that led to such diagnosis.
Conversely, traditional approaches follow a pre-hoc approach that initially
localises suspicious areas that are subsequently classified to establish the
breast malignancy -- this approach is trained using strongly annotated data
(i.e., it needs a delineation and classification of all lesions in an image).
Another goal of this paper is to establish the advantages and disadvantages of
both approaches when applied to breast screening from DCE-MRI. Relying on
experiments on a breast DCE-MRI dataset that contains scans of 117 patients,
our results show that the post-hoc method is more accurate for diagnosing the
whole volume per patient, achieving an AUC of 0.91, while the pre-hoc method
achieves an AUC of 0.81. However, the performance for localising the malignant
lesions remains challenging for the post-hoc method due to the weakly labelled
dataset employed during training. Comment: Submitted to Medical Image Analysis
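The AUC figures quoted above are patient-level ranking scores. As a minimal sketch of how such a value can be computed, the snippet below uses the pairwise (Mann-Whitney) formulation of the area under the ROC curve. The function name `roc_auc` and the labels and scores are hypothetical and purely illustrative; they are not taken from the paper or its dataset.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for malignant, 0 for non-malignant (one entry per patient).
    scores: classifier outputs, higher means more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical patient-level labels and scores, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

An AUC of 0.75 here means that three of the four malignant/non-malignant patient pairs are ranked in the correct order.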
2D Image head pose estimation via latent space regression under occlusion settings
Head orientation estimation is a challenging Computer Vision problem that has been
extensively researched and has a wide variety of applications. However, current
state-of-the-art systems still underperform in the presence of occlusions and
are unreliable for many task applications in such scenarios. This work proposes
a novel deep learning approach for the problem of head pose estimation under
occlusions. The strategy is based on latent space regression as a fundamental
key to better structure the problem for occluded scenarios. Our model surpasses
several state-of-the-art methodologies for occluded HPE, and achieves similar
accuracy for non-occluded scenarios. We demonstrate the usefulness of the
proposed approach with: (i) two synthetically occluded versions of the BIWI and
AFLW2000 datasets, (ii) real-life occlusions of the Pandora dataset, and (iii)
a real-life application to human-robot interaction scenarios where face
occlusions often occur. Specifically, the autonomous feeding from a robotic
arm
Efficient Optimization Algorithm for Space-Variant Mixture of Vector Fields
This paper presents a new algorithm for trajectory classification of human activities. The presented framework uses a mixture of parametric space-variant vector fields to describe pedestrians' trajectories. An advantage of the proposed method is that the vector fields are not constant and depend on the pedestrian's location. This means that switching among vector fields may occur at any image location and should be accurately estimated. In this paper, the model is equipped with a novel methodology to estimate the switching probabilities among motion regimes. More specifically, we propose an iterative optimization of the switching probabilities based on the natural gradient vector with respect to the Fisher information metric. This approach follows an information-geometric framework and contrasts with more traditional constrained-optimization approaches, in which Euclidean gradient-based methods are combined with probability simplex constraints. We demonstrate the superior performance of the proposed approach in the classification of pedestrians' trajectories in synthetic and real data sets concerning far-field surveillance scenarios.
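To illustrate the kind of update the abstract refers to, the following is a minimal sketch of natural-gradient ascent on a probability simplex under the Fisher information metric. It is not the paper's algorithm: it assumes a single categorical distribution rather than space-variant switching probabilities, and the function name `natural_gradient_step`, the counts and the step size are all hypothetical.

```python
def natural_gradient_step(p, grad, step):
    """One natural-gradient ascent step on the probability simplex.

    Under the Fisher information metric of a categorical distribution, the
    natural gradient of an objective f at p has components
    p_i * (grad_i - sum_j p_j * grad_j), which always sum to zero.
    """
    mean_grad = sum(pi * gi for pi, gi in zip(p, grad))
    return [pi + step * pi * (gi - mean_grad) for pi, gi in zip(p, grad)]


# Hypothetical counts of observed switches into three motion regimes.
counts = [6.0, 3.0, 1.0]
total = sum(counts)

p = [1 / 3, 1 / 3, 1 / 3]  # start from uniform switching probabilities
for _ in range(50):
    # Euclidean gradient of the log-likelihood sum_i counts[i] * log(p[i]).
    grad = [c / pi for c, pi in zip(counts, p)]
    p = natural_gradient_step(p, grad, step=0.5 / total)

print([round(pi, 4) for pi in p])
```

Because the natural-gradient components sum to zero, the iterates stay on the simplex without an explicit projection or constraint. In this toy problem they approach the maximum-likelihood estimate `counts[i] / total`.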
Training Medical Image Analysis Systems like Radiologists
The training of medical image analysis systems using machine learning
approaches follows a common script: collect and annotate a large dataset, train
the classifier on the training set, and test it on a hold-out test set. This
process bears no direct resemblance to radiologist training, which is based
on solving a series of tasks of increasing difficulty, where each task involves
the use of significantly smaller datasets than those used in machine learning.
In this paper, we propose a novel training approach inspired by how
radiologists are trained. In particular, we explore the use of meta-training
that models a classifier based on a series of tasks. Tasks are selected using
teacher-student curriculum learning, where each task consists of simple
classification problems containing small training sets. We hypothesize that our
proposed meta-training approach can be used to pre-train medical image analysis
models. This hypothesis is tested on the automatic breast screening
classification from DCE-MRI trained with weakly labeled datasets. The
classification performance achieved by our approach is shown to be the best in
the field for that application, compared to state-of-the-art baseline approaches:
DenseNet, multiple instance learning and multi-task learning. Comment: Oral Presentation at MICCAI 201
MDF-Net for Abnormality Detection by Fusing X-Rays with Clinical Data
This study investigates the effects of including patients' clinical
information on the performance of deep learning (DL) classifiers for disease
location in chest X-ray images. Although current classifiers achieve high
performance using chest X-ray images alone, our interviews with radiologists
indicate that clinical data is highly informative and essential for
interpreting images and making proper diagnoses.
In this work, we propose a novel architecture consisting of two fusion
methods that enable the model to simultaneously process patients' clinical data
(structured data) and chest X-rays (image data). Since these data modalities
are in different dimensional spaces, we propose a spatial arrangement strategy,
spatialization, to facilitate the multimodal learning process in a Mask R-CNN
model. We performed an extensive experimental evaluation using MIMIC-Eye, a
dataset comprising three modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED
(patients' clinical data), and REFLACX (annotations of disease locations in
chest X-rays).
Results show that incorporating patients' clinical data in a DL model
together with the proposed fusion methods improves the disease localization in
chest X-rays by 12% in terms of Average Precision compared to a standard Mask
R-CNN using only chest X-rays. Further ablation studies also emphasize the
importance of multimodal DL architectures and the incorporation of patients'
clinical data in disease localization. The architecture proposed in this work
is publicly available to promote the scientific reproducibility of our study
(https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).
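The abstract does not spell out how spatialization is implemented. As a rough sketch of one common way to give a structured vector a spatial layout, the snippet below tiles each clinical value into a constant channel and stacks it with the image channels. This is an assumption for illustration only and may differ from what MDF-Net does; `spatialize`, `fuse` and the toy values are hypothetical.

```python
def spatialize(clinical, height, width):
    """Tile each clinical value into a constant height x width channel."""
    return [[[value] * width for _ in range(height)] for value in clinical]


def fuse(image_channels, clinical):
    """Stack image channels and tiled clinical channels along the channel axis."""
    height = len(image_channels[0])
    width = len(image_channels[0][0])
    return image_channels + spatialize(clinical, height, width)


# Hypothetical 1-channel 2x3 "X-ray" and two normalised clinical values.
image = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]]
clinical = [0.7, 0.0]
fused = fuse(image, clinical)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 3 2 3
```

The fused tensor has one channel per image channel plus one per clinical feature, so a convolutional backbone can consume both modalities on the same spatial grid.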
Dual Semantic Fusion Network for Video Object Detection
Video object detection is a tough task due to the deteriorated quality of
video sequences captured under complex environments. Currently, this area is
dominated by a series of feature enhancement based methods, which distill
beneficial semantic information from multiple frames and generate enhanced
features through fusing the distilled information. However, the distillation
and fusion operations are usually performed at either frame level or instance
level with external guidance using additional information, such as optical flow
and feature memory. In this work, we propose a dual semantic fusion network
(abbreviated as DSFNet) to fully exploit both frame-level and instance-level
semantics in a unified fusion framework without external guidance. Moreover, we
introduce a geometric similarity measure into the fusion process to alleviate
the influence of information distortion caused by noise. As a result, the
proposed DSFNet can generate more robust features through the multi-granularity
fusion and avoid being affected by the instability of external guidance. To
evaluate the proposed DSFNet, we conduct extensive experiments on the ImageNet
VID dataset. Notably, the proposed dual semantic fusion network achieves, to
the best of our knowledge, the best performance of 84.1% mAP among the current
state-of-the-art video object detectors with ResNet-101 and 85.4% mAP with
ResNeXt-101 without using any post-processing steps. Comment: 9 pages, 6 figures